Abstract

Non-containment for free single variable program schemes
is shown to be NP-complete. A polynomial time algorithm for
deciding equivalence of two free schemes, provided one of them has
the predicates appearing in the same order in all executions,
is given. However, the ordering of a free scheme is shown to
lead to an exponential increase in size.

1.  Introduction

Much work in the theory of program schemes has gone into
the investigation of decidability properties for different
classes of schemes [G,M]. In the cases where a problem is
decidable, a natural question is to determine the complexity
of the decision procedure. Some of those questions were
answered in [CHS], where it was shown that noncontainment and
nonequivalence for single variable program schemes and for
monadic linear recursion schemes are NP-complete.

In this paper we investigate the complexity of these two
problems for the class of free single variable program schemes.
The requirement of freedom (i.e. absence of pieces of code which
cannot possibly be executed) is a very natural one if we want
to consider schemes which are models of real programs. Although
most real programs have more than one variable, we show that
even in the single variable case the equivalence problem is
difficult.

We show that the noncontainment problem for free schemes
remains NP-complete. We do not know the complexity of the
equivalence problem for free schemes (except that inequivalence
is in NP), but we can reduce it to the problem of determining
equivalence of acyclic schemes involving only predicates and
terminal assignment statements. We present a partial solution
to the equivalence problem by showing that if one of the schemes
has all predicates appearing in the same order, then there is
a polynomial time algorithm. However, we show that there are
schemes in which ordering the predicates causes an exponential
increase in size, indicating that preprocessing by ordering one
of the schemes cannot lead to a polynomial time algorithm.

The paper is organized in 5 sections. In section 2 we
introduce the notion of a B-scheme, which is an acyclic
single variable program scheme containing only predicates and
terminal assignment statements. Section 3 contains the proof
that noncontainment for free B-schemes is NP-complete as well
as the polynomial time algorithm for the case where one scheme
is ordered. In section 4 we present an unordered B-scheme with
no small equivalent ordered scheme, and in section 5 we show
that equivalence for the full class of free single variable
schemes is decidable in polynomial time if and only if the
equivalence problem for free B-schemes is decidable in
polynomial time.

Although this is a paper about program schemes, some of the
results, notably the exponential blow-up in section 4, are of

2.  Preliminaries

A B-scheme is a labeled rooted dag whose vertices have
outdegree 2 or 0. Vertices with outdegree 2 are called tests
and are labeled with Boolean variables; vertices with outdegree
0 are called leaves and are labeled by function symbols.
One edge from a test is labeled T, the other F. $|S|$ denotes
the number of nodes in scheme $S$. A B-scheme is free if there
is no path from the root to a leaf which contains two or more
tests with the same label.

Let $S$ be a B-scheme. A B-assignment $A$ (assignment for short)
is a mapping from the Boolean variables of $S$ to {true, false}.
$t(A)$ is the path constructed by starting at the root and
selecting the edge labeled T (F) whenever encountering a test
labeled $b$ where $A(b)$ = true (false). The value mapping Val
maps pairs of schemes and assignments to function symbols and
is defined as follows:

$\mathrm{Val}(S,A) = f$ iff the leaf reached by the path $t(A)$ has label $f$.

The B-schemes $S_1$ and $S_2$ are equivalent ($S_1 \equiv S_2$) if and
only if for each assignment $A$ whose domain contains all Boolean
variables in $S_1$ and $S_2$, $\mathrm{Val}(S_1,A) = \mathrm{Val}(S_2,A)$. One function
symbol $\Omega$ is designated as a special symbol and represents the
undefined function. $S_1$ is contained in $S_2$ ($S_1 \leq S_2$) if and
only if for each assignment $A$ whose domain contains all Boolean
variables in $S_1$ and $S_2$, either $\mathrm{Val}(S_1,A) = \Omega$ or
$\mathrm{Val}(S_1,A) = \mathrm{Val}(S_2,A)$.

We note that if the leaves in a B-scheme are replaced
by a HALT-statement, then we obtain the switching schemes of
[CHS].
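
The definitions of this section can be made concrete with a small sketch. The encoding below is an assumption made for illustration only (a scheme is a pair of a root id and a dictionary of nodes; none of the names come from the paper), and containment and equivalence are checked by brute force over all assignments, which takes exponential time and is not one of the algorithms discussed later.

```python
from itertools import product

# Hypothetical encoding: a B-scheme is (root, nodes), where nodes maps a node
# id to ("test", variable, T-successor, F-successor) or ("leaf", symbol).
OMEGA = "Ω"  # the designated "undefined" function symbol

def val(scheme, assignment):
    """Follow the path t(A) from the root and return the label of its leaf."""
    root, nodes = scheme
    node = nodes[root]
    while node[0] == "test":
        _, var, t_succ, f_succ = node
        node = nodes[t_succ if assignment[var] else f_succ]
    return node[1]

def variables(scheme):
    return {n[1] for n in scheme[1].values() if n[0] == "test"}

def is_free(scheme):
    """True iff no root-to-leaf path tests the same variable twice.
    Walks every path, so this is only meant for small examples."""
    root, nodes = scheme
    def walk(nid, seen):
        node = nodes[nid]
        if node[0] == "leaf":
            return True
        _, var, t_succ, f_succ = node
        if var in seen:
            return False
        return walk(t_succ, seen | {var}) and walk(f_succ, seen | {var})
    return walk(root, frozenset())

def all_assignments(names):
    names = sorted(names)
    for bits in product([True, False], repeat=len(names)):
        yield dict(zip(names, bits))

def contained(s1, s2):
    """S1 <= S2: wherever S1 is defined (not Ω) it agrees with S2."""
    return all(val(s1, a) in (OMEGA, val(s2, a))
               for a in all_assignments(variables(s1) | variables(s2)))

def equivalent(s1, s2):
    return all(val(s1, a) == val(s2, a)
               for a in all_assignments(variables(s1) | variables(s2)))

leaves = {"x": ("leaf", "f"), "y": ("leaf", "g"), "w": ("leaf", OMEGA)}
s1 = ("r", {"r": ("test", "a", "x", "w"), **leaves})
s2 = ("r", {"r": ("test", "b", "p", "q"),
            "p": ("test", "a", "x", "y"),
            "q": ("test", "a", "x", "y"), **leaves})
not_free = ("r", {"r": ("test", "a", "m", "y"),
                  "m": ("test", "a", "x", "y"), **leaves})
```

Here `s1` is contained in `s2` but not the other way round, and `not_free` tests the variable `a` twice on one path.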


3.  Containment  and  equivalence  for  free  B-schemes 


Here  we  show  that  the  containment  problem  for  free 
B-schemes  is  NP-complete,  and  that  in  certain  cases  we  can 
find  polynomial  time  algorithms  for  equivalence. 


Theorem 3.1: The set

$$\mathrm{BNCONT} = \{(S_1,S_2) \mid S_1 \text{ and } S_2 \text{ are free B-schemes and } S_1 \not\leq S_2\}$$

is NP-complete.


Proof: The usual guess and check method shows that BNCONT
is in NP.

To show that BNCONT is NP-hard we reduce 3-CNF satisfiability
to it. Let $F$ be a 3-CNF formula with variables $x_1, x_2, \ldots, x_k$ and
let $x_i$ appear uncomplemented in $F$ $p_i$ times and complemented
$q_i$ times. Let $u^i_1, u^i_2, \ldots, u^i_{p_i}$ be new variables and replace every
uncomplemented occurrence of $x_i$ in $F$ by a distinct $u^i_j$. Similarly
let $v^i_1, v^i_2, \ldots, v^i_{q_i}$ be new variables and replace every complemented
occurrence of $x_i$ by a distinct $v^i_j$. Let $F'$ be the formula
obtained by replacing every $x_i$. We will construct two schemes
$S_1$ and $S_2$ such that $S_1 \not\leq S_2$ iff the original formula $F$ is
satisfiable. Intuitively, when $S_1 \not\leq S_2$, $S_1$ will force the
satisfiability of the formula $F'$ and $S_2$ will enforce the

Now if the original formula $F$ is satisfiable we can find an
assignment $A$ such that $\mathrm{Val}(S_2,A) = g$ and $\mathrm{Val}(S_1,A) = f$, so
$S_1 \not\leq S_2$. Conversely, if $S_1 \not\leq S_2$, then there is an assignment
$A$ such that $\mathrm{Val}(S_1,A) = f$ and $\mathrm{Val}(S_2,A) = g$. But $\mathrm{Val}(S_2,A) = g$
only if, for each $i$, $u^i_1 = u^i_2 = \ldots = u^i_{p_i} = v^i_1 = \ldots = v^i_{q_i}$. Hence assigning
to each $x_i$ the value $A(u^i_1)$ satisfies $F$. Since $S_1$ and $S_2$ can
be written down in time polynomial in the length of $F$, BNCONT
is NP-hard. ■

We  now  turn  to  the  equivalence  problem  for  free  B-schemes. 
First  we  show  that  if  the  two  schemes  are  ordered,  then  there 
is  a polynomial  time  algorithm  for  deciding  equivalence. 

Definition 3.2: A B-scheme with Boolean variables $b_1, \ldots, b_k$ is
ordered if whenever a test labeled $b_i$ is a predecessor of a
test labeled $b_j$ then $i<j$. ■


In  the  proof  of  the  next  theorem  we  use  the  observation 
that  if  a scheme  is  ordered,  then  the  size  of  the  finite 
automaton  accepting  the  interpreted  value  language  [G]  is 
polynomial  in  the  size  of  the  scheme. 

Theorem  3.3:  There  is  a polynomial  time  equivalence  algorithm 
for  ordered  schemes. 

Proof: Let $S_1$ and $S_2$ be schemes in which the Boolean variables
$b_1, \ldots, b_k$ appear. We will construct deterministic finite
automata $M_1$ and $M_2$ from $S_1$ and $S_2$ such that $S_1 \equiv S_2$ iff
$L(M_1) = L(M_2)$. $M_i$ will accept the string $V_1 V_2 \cdots V_k f$ (where
$V_j$ is either T or F and $f$ is a function symbol) iff $\mathrm{Val}(S_i,A) = f$
where $A$ is the assignment

$$A(b_j) = \begin{cases} \text{true} & \text{if } V_j = T \\ \text{false} & \text{if } V_j = F. \end{cases}$$

$M_i$ is constructed as follows. We extend $S_i$ so that every
Boolean variable is tested on every path from root to leaf.
We may need to add extra tests if (1) the root is not labeled
$b_1$, (2) there is an edge from a test labeled $b_i$ to a test
labeled $b_j$, and $j>i+1$, or (3) there is an edge from a test
labeled $b_i$ to a leaf, and $i<k$. For example in the second case
the edge

[Figure: an edge from a test labeled $b_i$ to a test labeled $b_j$, $j>i+1$, and the extra tests added along it.]

We add a new accepting node and for each leaf labeled $f$ an
edge labeled $f$ from the leaf to the accepting node. Then the
resulting graph is the state graph of $M_i$; nodes are states,
edge labels are state transitions, the test labeled $b_1$ is
the start state, and the accepting node the only accepting state.

Since the Boolean variables are ordered it is clear that
$L(M_1) = L(M_2)$ iff $S_1 \equiv S_2$. Since $M_1$ and $M_2$ can be computed in
time polynomial in the size of $S_1$ and $S_2$, and equivalence of
deterministic finite automata can be decided in polynomial time
[AHU], there is a polynomial time algorithm for ordered schemes. ■
We  close  this  section  by  proving  that  Theorem  3.3 
remains  true  in  the  case  where  just  one  scheme  is  ordered. 

The  method  can  be  characterized  as  "graph  pushing". 


Definition 3.4: Let $S$ be a free B-scheme and $b$ a Boolean
variable. Then S[b=true] is the scheme obtained from $S$ by
setting $b$ to be true. More precisely:

1.  For each vertex $v$ labeled $b$ in $S$, do the following.
    Delete $v$ and any edges connected to it. Let $u$ be
    the vertex such that $(v,u)$ was labeled T. If $v$ was
    the root, make $u$ the root. Otherwise for each
    vertex $w$ such that $(w,v)$ was in $S$, insert edge
    $(w,u)$ and give it the label of $(w,v)$.

2.  Delete any inaccessible vertices.

S[b=false] is defined analogously. ■


Lemma 3.5: Let $S_1$ and $S_2$ be free B-schemes. Then $S_1 \equiv S_2$
if and only if

$S_1$[b=true] $\equiv$ $S_2$[b=true] and $S_1$[b=false] $\equiv$ $S_2$[b=false].

Proof: Immediate.


We now present a polynomial time algorithm which solves
the equivalence problem for two free B-schemes, provided one
is ordered.

Algorithm 3.6:

Input:  Free B-scheme $S_1$ and ordered B-scheme $S_2$.

Output: "Yes" if the schemes are equivalent, "No" otherwise.

begin
    comment L is a list of pairs of graphs which must be
        equivalent in order that $S_1$ and $S_2$ be equivalent;
    initialize L to ($S_1$,$S_2$);
    repeat
        let n be a node of $S_1$ all of whose predecessors have
            been marked and let v be the subgraph with root n;
        let $(v,v_1),\ldots,(v,v_m)$ be all the pairs of graphs on
            L in which v occurs;
        comment since $v_1,v_2,\ldots,v_m$ are subgraphs of an ordered
            scheme, the method in Theorem 3.3 can be used to
            test their equivalence;
        if $\neg(v_1 \equiv v_2 \equiv \ldots \equiv v_m)$ then output ("No") and halt;
        if v is a leaf then
            comment since v is trivially ordered, the method
                in Theorem 3.3 can again be used to test
                equivalence of v and $v_1$;
            if $v \not\equiv v_1$ then
                output ("No") and halt;
        else
            A:  add to L the pairs (v', $v_1$[b=true]) and (v'', $v_1$[b=false])
                where b is the label of v's root n and v' (v'')
                is the subgraph of $S_1$ reachable via n's
                outgoing T-edge (F-edge)
        fi;
        remove the pairs $(v,v_1),\ldots,(v,v_m)$ from L;
        mark n;
    until all nodes of $S_1$ have been marked;
    output ("Yes") and halt;
end
Theorem 3.7: Algorithm 3.6 works correctly and runs in polynomial
time.

Proof: It follows from Lemma 3.5 that the property

P: $S_1 \equiv S_2 \iff \forall (v,v_i) \in L : v \equiv v_i$

is an invariant for the loop. To show correctness then, it
is sufficient to note that P is true initially and that when
the algorithm stops, one of the following is true:

a)  all nodes have been marked, the list L is empty
    and the answer is "Yes".

b)  not all nodes have been marked, there is a pair
    $(v,v_i)$ on L such that $v \not\equiv v_i$ and the answer is "No".

To see that the algorithm runs in polynomial time
observe that the loop is executed at most $|S_1|$ times and each
execution of the loop requires at most $|S_2|$ equivalences of
ordered schemes which can be done in polynomial time by
Theorem 3.3. ■

Note that the freedom of $S_1$ guarantees that the graph
v' (v'') in the statement labeled A in the algorithm is equal
to v[b=true] (v[b=false]).

4.  A scheme with no small equivalent ordered scheme

Here we construct a free B-scheme $S_0$ whose smallest
ordered equivalent has size "exponential" in $|S_0|$. First we
need some extra notation.

Let $S$ be a B-scheme. A partial B-assignment (partial
assignment for short) is a partial mapping from the Boolean
variables of $S$ to {true, false}. Two partial assignments $A_1$ and
$A_2$ are consistent if they have the same value whenever they
are both defined. The union of two consistent partial assignments
$A_1$ and $A_2$, $A_1 \cup A_2$, is defined to be

$$(A_1 \cup A_2)(b) = \begin{cases} A_1(b) & \text{if } A_1(b) \text{ is defined} \\ A_2(b) & \text{if } A_2(b) \text{ is defined} \\ \text{undefined} & \text{otherwise.} \end{cases}$$

A partial assignment $A_1$ is an extension of $A_2$ if for each
Boolean variable $b$, $A_2(b)$ defined implies $A_1(b) = A_2(b)$.
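
These three notions translate directly into operations on dictionaries. In the sketch below a partial assignment is assumed to be a Python dict containing only the variables on which it is defined; the function names are illustrative.

```python
def consistent(a1, a2):
    """True iff a1 and a2 agree wherever both are defined."""
    return all(a2[b] == value for b, value in a1.items() if b in a2)

def union(a1, a2):
    """The union of two consistent partial assignments."""
    if not consistent(a1, a2):
        raise ValueError("partial assignments are not consistent")
    return {**a2, **a1}

def is_extension(a1, a2):
    """True iff a1 extends a2: wherever a2 is defined, a1 agrees with it."""
    return all(b in a1 and a1[b] == value for b, value in a2.items())

a1 = {"u1": True}
a2 = {"v1": False}
both = union(a1, a2)
```

The union of two consistent partial assignments is an extension of each of them.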

Let  S be  a scheme.  A partial  assignment  A determines  a 
path  from  the  root  to  a node  which  is  either  a leaf  or  a test 
with  a label  on  which  A is  not  defined.  Nodes  on  this  path 
are  said  to  be  specified  by  A.  Any  node  specified  by  some 
extension  of  A is  said  to  be  reachable  via  A.  Note  that  the  path 
determined  by  A can  not  be  extended  arbitrarily  by  an  extension 
of  A since  certain  tests  not  on  the  path  may  already  be  specified 
by  A. 

Assume that $n$ is a power of 2. The scheme $S_0$ will contain
$2n-1$ Boolean variables $u_1,\ldots,u_{n-1},v_1,\ldots,v_n$. We say that a
partial assignment $A$ satisfies an equality $u_i=v_j$ if $A(u_i)$ and
$A(v_j)$ are both defined and are equal. Given a set of equalities
$\{u_{i_1}=v_{j_1},\ldots,u_{i_m}=v_{j_m}\}$ we construct the scheme, called a column,
shown below.

[Figure: the column constructed from the equalities $u_{i_1}=v_{j_1},\ldots,u_{i_m}=v_{j_m}$, with leaves labeled 0 and 1.]

Note that if $A$ satisfies all equalities then the node labeled
1 is reachable via $A$.

The scheme $S_0$ is now constructed in two stages.

a)  The base of $S_0$ is a complete binary tree with $n-1$
    interior nodes labelled with $u_1,\ldots,u_{n-1}$. The leaves
    are numbered from 0 to $n-1$.

b)  The $i$'th leaf is replaced by the column obtained as
    follows. Remove from the set of equalities

    $$\{u_1=v_{(1+i) \bmod n},\; u_2=v_{(2+i) \bmod n},\; \ldots,\; u_{n-1}=v_{(n-1+i) \bmod n}\}$$

    all equalities involving variables that occur on the path
    from the root to leaf $i$, and construct the column from the
    remaining equalities. Note that the sets of equalities
    are just cyclic permutations of equalities between
    $\{u_1,\ldots,u_{n-1}\}$ and $\{v_1,\ldots,v_n\}$.

The following facts about $S_0$ are evident.

a)  $S_0$ is free and has $n-1+3(n-1-\log n)\cdot n+2n < 3n^2$ nodes.

b)  No equality constraint appears more than once.

c)  Every path from the root to a leaf labeled 1 is missing
    $\log n$ variables among the $v$'s.

Now let $S_1$ be an ordered B-scheme which is equivalent to
$S_0$, and let $Y$ be the $\sqrt{n/2}$ Boolean variables which come first
in the ordering. We shall show that there are "exponentially"
many assignments to variables in $Y$ which compute different functions
of the remaining variables. Since each of these different functions
must be represented by different nodes in $S_1$, $S_1$ must have
"exponentially" many nodes.

Relabel the variables such that $Y = \{y_1,\ldots,y_{\sqrt{n/2}}\}$ and let the
remaining variables be $Z = \{z_1,\ldots,z_{2n-1-\sqrt{n/2}}\}$. Call a column
in $S_0$ acceptable if there is no equality $y_i = y_j$ between two
elements of $Y$ appearing in the column. There are at most
$(\sqrt{n/2})^2 = n/2$ unacceptable columns. Call an assignment $A$ to
variables in $Y$ acceptable if there is some acceptable column
reachable via $A$.

Now  we  show  the  key  result  of  this  section,  that  if  two 
acceptable  assignments  are  "a  little  different"  then  they  can 
be  extended  such  that  one  of  them  specifies  a node  labeled  1 
and  the  other  a node  labeled  0. 


Lemma 4.1: Let $A_1$ and $A_2$ be acceptable assignments (to the
variables in $Y$) which differ in more than $\log n$ variables. Then
there is an assignment $A$ to the variables in $Z$ such that
$\mathrm{Val}(S_0, A_1 \cup A) \neq \mathrm{Val}(S_0, A_2 \cup A)$.

Proof: Since $A_1$ and $A_2$ are acceptable assignments, we can
always reach acceptable columns via $A_1$ and $A_2$. There are two
cases to consider:

1)  Assume that some acceptable column $C$ is reachable
via both $A_1$ and $A_2$. There are $2 \log n$ variables which
do not appear in $C$. Half of them are $u$'s which appear on the
path from the root to the column. The other half consists of
$v$'s. $A_1$ and $A_2$ cannot differ on the variables on the path from
the root to $C$ since $C$ is reachable via both $A_1$ and $A_2$. Thus
even if $A_1$ and $A_2$ differ on all the $\log n$ $v$'s missing from
column $C$, there is at least one variable, $y_i \in Y$, which appears
in an equality of $C$ on which $A_1$ and $A_2$ differ. (The variable
$y_i$ may be either a $u$ or a $v$, we don't care which.) The equality
in which $y_i$ appears must be of the form $y_i=z_j$, $z_j \in Z$, since the
column is acceptable, that is, the column has no equality
between two $y$'s. Since $S_0$ is free, $z_j$ does not appear on
the path from the root to $C$. Hence we can find an assignment
$A$ to the variables in $Z$ such that $A_1 \cup A$ and $A_2 \cup A$ both specify
$C$ and $A_1 \cup A$ satisfies all equations in $C$. However, $A(z_j) =
A_1(y_i) \neq A_2(y_i)$. So $\mathrm{Val}(S_0, A_1 \cup A) = 1$ and $\mathrm{Val}(S_0, A_2 \cup A) = 0$.

2)  Assume that there is no acceptable column $C$ which is
reachable via both $A_1$ and $A_2$. We first find a partial
assignment $A$ to the variables in $Z$ such that $A_1 \cup A$ specifies
a column which can be satisfied by some extension, $A'$, of
$A_1 \cup A$. Then we show that we can choose the extension $A'$ such
that it satisfies the column specified by $(A_1 \cup A)$ but the
column specified by $(A_2 \cup A) \cup A'$ is not satisfiable.

Let $C_1$ be an acceptable column reachable via $A_1$ and let $A$
be the minimal partial assignment such that $A_1 \cup A$ specifies $C_1$
and all equations in $C_1$ involving variables in $Y$ are satisfied
(this is always possible since $A_1$ is acceptable, $S_0$ is free and
no $y_i=y_j$ appears in $C_1$). $A$ is now defined for at most
$|Y|+\log n = \sqrt{n/2}+\log n$ variables. Perform the following step
while $A_2 \cup A$ does not specify some column: let $z_k$ be the label
of the last node specified by $A_2 \cup A$. Extend $A$ by setting $z_k$ to
be false, and if $z_k=z_e$ appears in $C_1$, extend $A$ to set $z_e$ to
false. (Setting $z_k$ and $z_e$ to true would work equally well.)
This process terminates after adding at most $2 \log n$ variables
to $A$, after which $A_2 \cup A$ specifies some column $C_2$ ($C_2$ is not
necessarily acceptable). Note that all equalities in $C_1$ involving
variables in $A_1 \cup A$ are still satisfied. There are at least
$(n-\log n-|A|)/2 = (n-\log n-\sqrt{n/2}-3 \log n)/2$ equalities in $C_1$
all of whose variables are unassigned by $A_1 \cup A$. There are only
$2 \log n$ variables not appearing in $C_2$, thus there is a $z_i=z_j$
in $C_1$, $z_i$ and $z_j$ not assigned in $A_1 \cup A$, and $z_i=x_e$, some $x_e$,
is in $C_2$. $x_e$ is not $z_j$ by the construction of $S_0$. Now by
extending $A$ so that all equalities in $C_1$ are satisfied, and
$A(z_i) = A(z_j) \neq (A_2 \cup A)(x_e)$, we can ensure that $A_1 \cup A$
satisfies $C_1$ whereas $A_2 \cup A$ does not satisfy $C_2$.
This completes the proof of the lemma. ■

Before we can show that there are many acceptable assignments
which differ by more than $\log n$ of the variables we prove the
following lemma which states that the total number of acceptable
assignments is big.

Lemma 4.2: Let $S$ be a B-scheme whose graph is a complete binary
tree, with $2^k-1$ interior nodes labeled with variables
$u_1,\ldots,u_{2^k-1}$ and $2^k$ leaves labeled over {0,1}. Let $M$ be any subset
of the variables of size $m$ and let the number of leaves labeled 1
be $g$. Call an assignment to the variables in $M$ acceptable if
a leaf labeled 1 is reachable from it, and denote by $A(m,g,k)$ the
number of acceptable assignments. Then $A(m,g,k) \geq 2^m g/2^k$.

Proof: The proof is by induction on $k$, the height of the tree.

Basis: The result is immediate for $k=0$.

Induction step: Assume that $A(m,g,k-1) \geq 2^m g/2^{k-1}$ and consider
complete binary trees with $2^k$ leaves. Let the number of leaves
labeled 1 in the left subtree be $g_\ell$ and in the right subtree
$g_r$. Let the number of variables from $M$ in the left subtree be
$\ell$ and in the right subtree $r$. There are two cases to consider.

1)  The root is not labeled with a variable in $M$, hence
$\ell+r = m$. Now

$$A(m,g,k) = 2^\ell A(r,g_r,k-1) + 2^r A(\ell,g_\ell,k-1) - A(r,g_r,k-1)\,A(\ell,g_\ell,k-1)$$

and using the inductive hypothesis

$$\begin{aligned}
A(m,g,k) &\geq 2^\ell (2^r g_r/2^{k-1}) + 2^r (2^\ell g_\ell/2^{k-1}) - (2^r g_r/2^{k-1})(2^\ell g_\ell/2^{k-1}) \\
&= 2^{\ell+r}\left[(g_\ell+g_r)/2^{k-1} - g_\ell g_r/2^{2(k-1)}\right] \\
&= 2^m\left[g/2^k + g/2^k - g_\ell g_r/2^{2(k-1)}\right] \\
&\geq 2^m g/2^k \quad \text{as } g_\ell, g_r \leq 2^{k-1}.
\end{aligned}$$

2)  The root is labeled with a variable from $M$. Then
$\ell+r+1 = m$ and

$$\begin{aligned}
A(m,g,k) &= 2^\ell A(r,g_r,k-1) + 2^r A(\ell,g_\ell,k-1) \\
&\geq 2^\ell (2^r g_r/2^{k-1}) + 2^r (2^\ell g_\ell/2^{k-1}) \\
&= 2^{\ell+r}(g_\ell + g_r)/2^{k-1} \\
&= 2^m g/2^k. \qquad ■
\end{aligned}$$
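
The bound of Lemma 4.2 can be checked by brute force on small trees. The sketch below is only a sanity check under assumed conventions: the tree is stored in heap order (node $j$ has children $2j$ and $2j+1$, and the variable tested at interior node $j$ is identified with the index $j$), the T-edge leads to the left child, and the names are made up for this example.

```python
import random
from itertools import product

def one_reachable(node, k, labels, fixed):
    """Is a leaf labeled 1 reachable via the partial assignment `fixed`?
    Interior nodes are 1 .. 2^k - 1, leaves are 2^k .. 2^(k+1) - 1."""
    if node >= 2 ** k:
        return labels[node - 2 ** k] == 1
    if node in fixed:                         # tested variable is assigned
        child = 2 * node if fixed[node] else 2 * node + 1
        return one_reachable(child, k, labels, fixed)
    return (one_reachable(2 * node, k, labels, fixed)
            or one_reachable(2 * node + 1, k, labels, fixed))

def count_acceptable(k, labels, subset):
    """A(m, g, k) for the tree with the given leaf labels and M = subset."""
    subset = sorted(subset)
    return sum(one_reachable(1, k, labels, dict(zip(subset, bits)))
               for bits in product([True, False], repeat=len(subset)))

def bound_holds(k, trials=200, seed=0):
    """Check A(m, g, k) * 2^k >= 2^m * g on random labelings and subsets."""
    rng = random.Random(seed)
    for _ in range(trials):
        labels = [rng.randint(0, 1) for _ in range(2 ** k)]
        subset = rng.sample(range(1, 2 ** k), rng.randint(0, 2 ** k - 1))
        g, m = sum(labels), len(subset)
        if count_acceptable(k, labels, subset) * 2 ** k < 2 ** m * g:
            return False
    return True
```

For instance, with $k=1$, leaf labels `[1, 0]` and $M$ containing the root variable, exactly one of the two assignments is acceptable, which meets the bound $2^1 \cdot 1/2^1 = 1$ with equality.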


Now we can prove that any ordered scheme equivalent to
$S_0$ must be big.

Theorem 4.3: Let $S_1$ be an ordered B-scheme which is equivalent
to $S_0$. Then

$$|S_1| \geq 2^{m-(\log^2 n+1)/2} \quad \text{where } m = \sqrt{n/2}.$$

Proof: From the discussion preceding Lemma 4.1 we know that $S_0$
contains at least $n/2$ acceptable columns. Since $Y$ contains $m$
variables there are at least $A(m,n/2,\log n)$ acceptable assignments
to variables in $Y$. From Lemma 4.1 we know that if two of these
assignments differ by more than $\log n$ of the variables then
they must lead to two different nodes in $S_1$. Now there are at
most $\binom{m}{i}$ assignments to $m$ variables which differ from a given
assignment in $i$ variable values. Hence there can be at most

$$\sum_{i=0}^{\log n} \binom{m}{i} \leq \sum_{i=0}^{\log n} m^i \leq m^{\log n+1}$$

assignments which differ from a given assignment by at most
$\log n$ variables. Therefore, there are at least
$A(m,n/2,\log n)/m^{\log n+1}$ acceptable assignments which
differ by more than $\log n$ variables and hence
$|S_1| \geq A(m,n/2,\log n)/m^{\log n+1}$. By Lemma 4.2 we now get

$$\begin{aligned}
|S_1| &\geq \left(2^m (n/2)/2^{\log n}\right)/m^{\log n+1} \\
&= 2^{m-1-(\log n+1)\log m} \\
&= 2^{m-1-(\log n+1)(\log n-1)/2} \quad (\text{recall that } m = \sqrt{n/2}) \\
&= 2^{m-\frac{1}{2}(\log^2 n+1)}
\end{aligned}$$

and the theorem is proved. ■
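
To get a feeling for the numbers, the short sketch below evaluates the exponent of the lower bound of Theorem 4.3 next to the upper bound $3n^2$ on $|S_0|$ from fact a) above. The function names are illustrative, and logarithms are taken to base 2.

```python
import math

def log2_ordered_lower_bound(n):
    """log2 of the bound 2^(m - (log^2 n + 1)/2) with m = sqrt(n/2)."""
    m = math.sqrt(n / 2)
    return m - (math.log2(n) ** 2 + 1) / 2

def s0_size_upper_bound(n):
    """The bound 3n^2 on the number of nodes of S_0."""
    return 3 * n * n

# Compare the two sizes on a log scale for a few powers of 2.
table = [(e, log2_ordered_lower_bound(2 ** e), math.log2(s0_size_upper_bound(2 ** e)))
         for e in (9, 15, 21)]
```

The exponent is negative for small $n$ (for $n = 2^9$ it is $16 - 41 = -25$, so the bound says nothing), and it overtakes the logarithm of $3n^2$ only for fairly large $n$: for $n = 2^{21}$ it is $1024 - 221 = 803$, while $\log_2 3n^2$ is below 44.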


5.  Extension to single variable program schemes

In  this  section  we  show  that  the  equivalence  problem  for 
free  single  variable  program  schemes  (free  Ianov  schemes)  is 
polynomial  time  equivalent  to  the  equivalence  problem  for  free 
B-schemes. 

A single  variable  program  scheme  (an  I-scheme)  is  a rooted 
directed  graph  (not  necessarily  acyclic)  whose  nodes  have 
outdegree  0,1  or  2.  Nodes  with  outdegree  2 are  tests  and  are 
labeled  with  Boolean  variables.  Nodes  with  outdegree  0 and  1 are 
called  function  nodes  and  are  labeled  with  function  symbols. 

Only vertices with outdegree 0 may be labeled with $\Omega$. Edges
leaving tests are labeled with T and F as in B-schemes. An
I-scheme is free if every B-scheme which is a subgraph is free.

We shall only be interested in the behaviour of our schemes
under Herbrand interpretations (free interpretations [G]) where
the values of the Boolean variables can change after each function
step. We extend the notion of B-assignments in the following way.
Let $F$ be a set of function symbols. An I-assignment $A$ maps
elements from $(F-\{\Omega\})^*$ into B-assignments. The interpretation
of $A(w)$ is the mapping defining the values of the Boolean
variables in state $w$ (the state after computing the functions in $w$).
The path determined by $A$ in $S$ is the obvious generalization of
the trace $t(A)$ defined for B-schemes.

The proof that we can determine equivalence of free I-schemes
in polynomial time given an oracle for equivalence of free
B-schemes uses a procedure which is very similar to the
minimization procedure for deterministic finite automata on p. 124-127
in [AU].

Let $F$ be a set of function symbols, and denote by $(F-\{\Omega\})^*_k$
the set of all strings over $F-\{\Omega\}$ of length $k$ or less. A
k-assignment is defined as an I-assignment except that its
domain is $(F-\{\Omega\})^*_k$ rather than $(F-\{\Omega\})^*$.

The path label $p\ell(S,A)$ for I-scheme $S$ and k-assignment
$A$ is the string of function symbols appearing along the path
determined by $A$. (The string may be of length less than $k$ if
the path reaches a leaf.) Let function nodes $n_1$ and $n_2$ appear in
$S$, and let $S_1$ and $S_2$ be the (sub)-schemes with $n_1$ and $n_2$ as roots.
Then $n_1$ is k-equivalent to $n_2$ if for each k-assignment $A$,
$p\ell(S_1,A) = p\ell(S_2,A)$. Thus for example two function nodes are
0-equivalent iff they have the same label.

The  next  lemma,  the  proof  of  which  we  leave  to  the  reader, 
states  that  k-equivalence  can  be  determined  from  (k-1 ) -equivalence 
and  some  equivalence  tests  on  B-schemes. 

Lemma 5.1: Let $S$ be a free I-scheme with function nodes $n_1$ and
$n_2$. Let $v_i$ be the B-scheme whose root is the descendant of $n_i$,
$i=1$ or 2 ($v_i$ may be simply a function node). Label each leaf
in $v_i$ by its equivalence class in the (k-1)-equivalence
relation. Then $n_1$ and $n_2$ are k-equivalent if and only if $n_1$
and $n_2$ are (k-1)-equivalent and $v_1 \equiv v_2$, where the last equivalence
is of B-schemes. ■

Theorem 5.2: Let $S$ be a free I-scheme with $t$ nodes. Given an
oracle for determining equivalence of free B-schemes, there is
a polynomial time algorithm for determining if two function nodes
in $S$ are k-equivalent for all $k$.

Proof: It follows trivially from the preceding lemma that two
nodes are k-equivalent for all $k$ if and only if they are
t-equivalent. Since 0-equivalence is easy to determine (the
nodes must have the same label), we can use Lemma 5.1 to compute
k-equivalence for $k = 1,2,\ldots,t$. At most $t^2$ B-scheme tests are
made for each value of $k$, hence at most $t^3$ B-scheme tests are
made altogether. ■
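
The refinement behind Theorem 5.2 can be sketched as a partition refinement over the function nodes. Everything below is an assumption made for illustration: the I-scheme encoding, the names, and in particular the B-scheme "oracle", which is replaced by a brute-force signature (the class reached under every assignment) and therefore takes exponential time. The sketch also assumes that no cycle consists of tests only, as is the case in a free I-scheme.

```python
from itertools import product

# Hypothetical encoding of an I-scheme: nodes maps a node id to
# ("test", variable, T-successor, F-successor) or
# ("func", symbol, successor-or-None).

def reached_classes(nodes, start, classes, variables):
    """Brute-force stand-in for the B-scheme oracle: for every assignment,
    the class of the function node reached from `start`."""
    signature = []
    for bits in product([True, False], repeat=len(variables)):
        assignment = dict(zip(variables, bits))
        nid = start
        while nodes[nid][0] == "test":
            _, var, t_succ, f_succ = nodes[nid]
            nid = t_succ if assignment[var] else f_succ
        signature.append(classes[nid])
    return tuple(signature)

def equivalence_classes(nodes):
    """Class ids such that two function nodes get the same id iff they are
    k-equivalent for all k (computed by refining 0-equivalence)."""
    variables = sorted({n[1] for n in nodes.values() if n[0] == "test"})
    funcs = [nid for nid, n in nodes.items() if n[0] == "func"]
    classes = {nid: nodes[nid][1] for nid in funcs}   # 0-equivalence: labels
    for _ in range(len(nodes)):                       # t rounds suffice
        keys = {}
        for nid in funcs:
            succ = nodes[nid][2]
            sig = (None if succ is None
                   else reached_classes(nodes, succ, classes, variables))
            keys[nid] = (classes[nid], sig)           # Lemma 5.1
        ids = {}
        refined = {nid: ids.setdefault(key, len(ids)) for nid, key in keys.items()}
        if len(set(refined.values())) == len(set(classes.values())):
            return refined                            # partition is stable
        classes = refined
    return classes

scheme = {
    "n1": ("func", "f", "t1"), "t1": ("test", "p", "n1", "n3"),
    "n2": ("func", "f", "t2"), "t2": ("test", "p", "n2", "n4"),
    "n3": ("func", "h", None), "n4": ("func", "h", None),
    "n5": ("func", "f", "t5"), "t5": ("test", "p", "n3", "n3"),
}
result = equivalence_classes(scheme)
```

In the example, `n1` and `n2` head two copies of the same loop and end up in one class, while `n5` has the same label but always proceeds to `h` and is separated after one round.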


Having shown how to handle k-equivalence for all $k$ we now
define what it means for two I-schemes to be equivalent.

Let $S$ be an I-scheme and $A$ an I-assignment (i.e. $A$ maps
elements from $(F-\{\Omega\})^*$ to B-assignments). The value mapping
Val is defined as follows.

$$\mathrm{Val}(S,A) = \begin{cases} \text{the function symbols on the path determined by } A & \text{if the path is finite and does not end in } \Omega \\ \Omega & \text{otherwise} \end{cases}$$

Two I-schemes $S_1$ and $S_2$ are equivalent if $\mathrm{Val}(S_1,A) = \mathrm{Val}(S_2,A)$ for
all I-assignments $A$. It is clear that this definition means
equivalence under all Herbrand interpretations (free interpretations)
and it is well known that this implies equivalence under all
interpretations [G].

We would like to show that two schemes are equivalent iff
their root nodes are k-equivalent for all $k$. Unfortunately this
is not quite true; the problem is that the schemes may both
compute $\Omega$ but do so in different ways.

A free I-scheme is compact if from every non-leaf node there
is a path to a leaf not labeled $\Omega$.


Lemma  5.3:  There  is  a polynomial  time  algorithm  to  transform 
any  free  I-scheme  into  an  equivalent  compact  free  scheme. 

Proof : Immediate.  ■ 

Lemma 5.4: Two free compact I-schemes $S_1$ and $S_2$ are equivalent
iff their roots $n_1$ and $n_2$ are k-equivalent for every $k$.

Proof: It is clear that if $n_1$ and $n_2$ are k-equivalent for all
$k$, then $S_1$ is equivalent to $S_2$. Conversely, suppose $S_1$ is equivalent
to $S_2$ and let $k$ be the smallest value for which there is a
k-assignment $A$ such that $p\ell(S_1,A) \neq p\ell(S_2,A)$. Not both of
$p\ell(S_1,A)$ and $p\ell(S_2,A)$ can end in $\Omega$, so assume $p\ell(S_1,A)$ does not.
We can extend $A$ to an $\ell$-assignment $A'$, $\ell>k$, with $A'(w) = A(w)$
for all $w$, $|w| \leq k$, such that $A'$ defines a path to a leaf not
labeled $\Omega$ in $S_1$. Now since the $k$th symbol on the path defined
by $A'$ in $S_1$ is different from the $k$th symbol on the path in $S_2$,
and $\mathrm{Val}(S_1,A') \neq \Omega$, we must have $S_1$ not equivalent to $S_2$, a
contradiction. ■

Now the following theorem is an immediate corollary of
the preceding lemmas.

Theorem 5.5: There is a polynomial time algorithm to decide
equivalence of free I-schemes if and only if there is a polynomial
time algorithm to decide equivalence of free B-schemes. ■

We close this section with the remark that non-inclusion
for I-schemes is NP-complete. Inclusion for I-schemes is
defined exactly as for B-schemes with "I-assignment" replacing
"B-assignment". That the problem is NP-hard is clear from
Theorem 3.1. That it is in NP is shown in [CHS].